Measuring Similarity of Large Software Systems Based on Source Code Correspondence

Authors

  • Tetsuo Yamamoto
  • Makoto Matsushita
  • Toshihiro Kamiya
  • Katsuro Inoue
Abstract

Knowing the quantitative similarity of large software systems is an important and intriguing problem. This paper proposes a similarity metric between two sets of source code files, based on the correspondence of overall source code lines. A Software similarity MeAsurement Tool (SMAT) was developed and applied to various versions of an operating system (BSD UNIX). The resulting similarity valuations clearly revealed the evolutionary history of the BSD UNIX operating system. As an extension of SMAT, a system-wide difference extraction tool was also developed, which effectively compressed a set of source code files relative to a base set.
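The abstract does not give the formula or the matching procedure, so the following is only a minimal sketch of one plausible reading of a "line correspondence" metric: the fraction of lines in both systems that have a counterpart in the other system. The function names, the pairing of files by identical name, and the use of Python's `difflib` for line matching are all assumptions made for illustration; they are not SMAT's actual method.

```python
from difflib import SequenceMatcher


def matched_lines(lines_a, lines_b):
    """Count in-order line correspondences between two files (assumed matcher)."""
    matcher = SequenceMatcher(None, lines_a, lines_b, autojunk=False)
    return sum(block.size for block in matcher.get_matching_blocks())


def similarity(system_a, system_b):
    """Ratio of corresponding lines to all lines across both systems.

    Each system is a dict mapping a file name to its list of lines.
    Assumption: only files with the same name are compared.
    """
    total = sum(len(v) for v in system_a.values())
    total += sum(len(v) for v in system_b.values())
    if total == 0:
        return 1.0  # two empty systems are treated as identical
    matched = 0
    for name, lines_a in system_a.items():
        lines_b = system_b.get(name)
        if lines_b is not None:
            # Each correspondence covers one line on each side.
            matched += 2 * matched_lines(lines_a, lines_b)
    return matched / total


if __name__ == "__main__":
    old = {"f.c": ["a", "b", "c"]}
    new = {"f.c": ["a", "x", "c"], "g.c": ["y"]}
    print(similarity(old, new))
```

Under these assumptions the value ranges from 0 (no corresponding lines) to 1 (every line has a counterpart), which makes it usable for comparing successive versions of the same system.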


Similar resources

A partition-based algorithm for clustering large-scale software systems

Clustering techniques are used to extract the structure of software for understanding, maintaining, and refactoring. In the literature, most of the proposed approaches for software clustering are divided into hierarchical algorithms and search-based techniques. In the former, clustering is a process of merging (splitting) similar (non-similar) clusters. These techniques suffered from the drawba...


2D DC resistivity forward modeling based on the integral equation method and a comparison with the RES2DMOD results

A 2D forward modeling code for DC resistivity is developed based on the integral equation (IE) method. Here, a linear relation between model parameters and apparent resistivity values is proposed, although the resistivity modeling is generally a nonlinear problem. Two synthetic cases are considered for the numerical calculations and the results derived from IE code are compared with the RES2DMO...


Measuring the Similarity of Trajectories Using Fuzzy Theory

In recent years, with the advancement of positioning systems, access to a large amount of movement data is provided. Among the methods of discovering knowledge from this type of data is to measure the similarity of trajectories resulting from the movement of objects. Similarity measurement has also been used in other data mining methods such as classification and clustering and is currently, an...


Measurement of Complexity and Comprehension of a Program Through a Cognitive Approach

The inherent complexity of the software systems creates problems in the software engineering industry. Numerous techniques have been designed to comprehend the fundamental characteristics of software systems. To understand the software, it is necessary to know about the complexity level of the source code. Cognitive informatics perform an important role for better understanding the complexity o...


Measuring the Entropy of Large Software Systems

Keywords: entropy, software systems, structure, metric. How does one measure a large software system to determine if it is "well-structured"? This report proposes a metric for doing just that, based on the concept of entropy from information theory. A tool that automatically extracts the metric from source code was built and used to compare two large software systems (each about 500,00 lines of source cod...




Publication date: 2005